However, even when a methylation site was a DMS, its host gene was not necessarily a DEG. Suppose the kth gene was the host gene of the mth DMS and the mth DMS was ranked at the top in the gth regression model, constructed for the gth DEG. The gene order distance between the gth DEG and the kth gene was defined as below,

$$|g - k|$$



The second distance was called the base pair distance. If the mean base-pair position of the gth DEG was denoted by $\omega_g$ and the methylation site of the mth DMS (the top-ranked DMS for the regression model of the gth DEG) was denoted by $\omega_m$, the base pair distance between them was defined as below,

$$|\omega_g - \omega_m|$$
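The two distances defined above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and taking the mean base-pair position of a gene as the midpoint of its start and end coordinates is an assumption, since the text does not say how the mean was computed.

```python
def gene_order_distance(g: int, k: int) -> int:
    """Gene order distance |g - k| between the gth DEG and the kth (host) gene."""
    return abs(g - k)


def mean_position(start: int, end: int) -> float:
    """Assumed definition: midpoint of a gene's start and end coordinates."""
    return (start + end) / 2


def base_pair_distance(deg_position: float, dms_position: float) -> float:
    """Base pair distance between a DEG position and a methylation site position."""
    return abs(deg_position - dms_position)


# Illustrative values only, not taken from the study.
order_d = gene_order_distance(120, 1500)
bp_d = base_pair_distance(mean_position(1000, 3000), 250000)
```

Both distances are symmetric and ignore direction, so they say how far apart a DEG and its top-ranked DMS are but not on which side the DMS lies.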



The gene order distances and base pair distances were binned into six bins to study the trend of the distribution pattern. Figure 4.30 shows the gene order distance distributions of four models. It can be seen that all distributions had the same pattern or trend, i.e., the distributions were skewed towards the right side, favouring the larger gene order distances. This shows that the host genes of most top-ranked DMSs were far away from the target DEGs. Note that a target DEG was one of the DEGs.
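The binning step can be illustrated with a short Python sketch. The bin edges below are hypothetical (the text states only that six bins were used, not where the boundaries lay), and `bin_distances` is an illustrative name rather than anything from the study.

```python
from bisect import bisect_right

# Hypothetical bin boundaries: five edges give six bins.
# Bin 0 holds d < 10, bin 1 holds 10 <= d < 100, ..., bin 5 holds d >= 100000.
EDGES = [10, 100, 1_000, 10_000, 100_000]


def bin_distances(distances, edges=EDGES):
    """Count how many distances fall into each of len(edges) + 1 bins."""
    counts = [0] * (len(edges) + 1)
    for d in distances:
        counts[bisect_right(edges, d)] += 1
    return counts


# Illustrative distances only.
counts = bin_distances([3, 10, 99, 1000, 20000, 500000])
```

A right-skewed pattern of the kind described would show up as larger counts in the later bins.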

A chi-square test validated the severity, or significance, of the skewness in the four models. All p values were extremely small. Therefore, there was no doubt that the most important contributors to the differential expression profile in most DEGs were the remote methylation sites rather than the local ones. This demonstrated the complexity of the genetic-epigenetic interaction in living cells. For instance, in the Lasso model, 77% (970 of 1,250 M2E models) of the target DEGs and the methylation sites of the top-ranked DMSs of these 970 DEGs were separated by more than 1,000 genes. About 34% (425 of 1,250 M2E models) of the target DEGs and the methylation sites of the top-ranked DMSs of these 425 DEGs were separated by more than 10,000 genes.
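As a rough illustration of the two calculations in this paragraph, the sketch below computes a chi-square goodness-of-fit statistic for binned counts and the share of models whose distance exceeds a threshold. The null hypothesis of equal expected counts in every bin is an assumption (the text does not state which expected distribution was tested), and both function names are hypothetical.

```python
def chi_square_uniform(counts):
    """Chi-square statistic against equal expected counts in every bin (assumed null)."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def fraction_beyond(distances, threshold):
    """Share of distances strictly greater than the threshold."""
    return sum(1 for d in distances if d > threshold) / len(distances)


# Illustrative counts only: a strongly right-skewed six-bin histogram.
stat = chi_square_uniform([0, 0, 0, 0, 0, 60])

# The reported Lasso shares follow from the counts given in the text.
share_1000 = 970 / 1250   # genes separated by more than 1,000 genes
share_10000 = 425 / 1250  # genes separated by more than 10,000 genes
```

With six bins the statistic has five degrees of freedom; a p value could then be obtained from a chi-square distribution, for example with `scipy.stats.chisquare` if SciPy is available.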